DNN-based SRE Systems in Multi-Language Conditions — Technical Report

Authors

  • Ondřej Novotný
  • Pavel Matějka
  • Ondřej Glembek
  • Oldřich Plchot
  • František Grézl
  • Lukáš Burget
  • Jan “Honza” Černocký

Abstract

This work studies the use of Deep Neural Network (DNN) i-vector/PLDA-based speaker recognition systems, currently the state of the art, in multi-language (especially non-English) conditions. We evaluate the systems’ performance on the “Language Pack” of the PRISM set using NIST’s standard metrics, and we study the use of a multi-lingual DNN in place of the original English DNN under these conditions. We show that not only does the gain from using DNNs vanish, but the DNN-based systems also tend to produce de-calibrated scores under the studied conditions. This work suggests directions for future research rather than offering any particular solutions.
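To illustrate what "de-calibrated scores" means in terms of a detection-cost metric, the following minimal sketch compares the actual detection cost (decisions made at the theoretical Bayes threshold) with the minimum detection cost (best threshold chosen after the fact). It assumes the scores are log-likelihood ratios. The function names, the operating point (p_target, c_miss, c_fa) and the toy scores are illustrative assumptions, not values taken from the paper.

```python
import math

def normalized_dcf(tar, non, threshold, p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Normalized detection cost at a fixed threshold (accept if score >= threshold)."""
    p_miss = sum(s < threshold for s in tar) / len(tar)   # targets rejected
    p_fa = sum(s >= threshold for s in non) / len(non)    # non-targets accepted
    cost = c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
    # Normalize by the cost of the best trivial system (always accept or always reject).
    return cost / min(c_miss * p_target, c_fa * (1.0 - p_target))

def bayes_threshold(p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Optimal threshold for well-calibrated log-likelihood-ratio scores."""
    return math.log(c_fa * (1.0 - p_target) / (c_miss * p_target))

def act_and_min_dcf(tar, non, p_target=0.01):
    """Return (actual DCF, minimum DCF); a large gap indicates poor calibration."""
    act = normalized_dcf(tar, non, bayes_threshold(p_target), p_target)
    # Sweep every distinct score as a threshold, plus "reject everything".
    candidates = sorted(set(tar) | set(non)) + [float("inf")]
    best = min(normalized_dcf(tar, non, t, p_target) for t in candidates)
    return act, best

# Hypothetical toy scores: the same trials, then with a constant offset added.
targets, nontargets = [5.0, 6.0, 7.0], [-5.0, -6.0, -7.0]
shifted_targets = [s + 10.0 for s in targets]
shifted_nontargets = [s + 10.0 for s in nontargets]
```

Calling act_and_min_dcf on the shifted scores leaves the minimum cost unchanged, because the ranking of trials is the same, while the actual cost can rise because the fixed Bayes threshold no longer separates the two classes. That gap between actual and minimum cost is one common way to quantify calibration loss.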

Similar resources

Investigation of bottleneck features and multilingual deep neural networks for speaker verification

Recently, the integration of deep neural networks (DNNs) with i-vector systems has proved to be effective for speaker verification. This method uses the DNN with senone outputs to produce frame alignments for sufficient statistics extraction. However, two types of data mismatch may degrade the performance of the DNN-based speaker verification systems. First, the DNN requires transcribed training...

A deep neural network speaker verification system targeting microphone speech

We recently proposed the use of deep neural networks (DNN) in place of Gaussian Mixture models (GMM) in the i-vector extraction process for speaker recognition. We have shown significant accuracy improvements on the 2012 NIST speaker recognition evaluation (SRE) telephone conditions. This paper explores how this framework can be effectively used on the microphone speech conditions of the 2012 N...

Improving Deep Neural Networks Based Speaker Verification Using Unlabeled Data

Recently, deep neural networks (DNNs) trained to predict senones have been incorporated into the conventional i-vector based speaker verification systems to provide soft frame alignments and show promising results. However, the data mismatch problem may degrade the performance since the DNN requires transcribed data (out-of-domain data) while the data sets (in-domain data) used for i-vector trainin...

UTD-CRSS Systems for 2016 NIST Speaker Recognition Evaluation

This study describes systems submitted by the Center for Robust Speech Systems (CRSS) from the University of Texas at Dallas (UTD) to the 2016 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE). We developed 4 UBM and DNN i-vector based speaker recognition systems with alternate data sets and feature representations. Given that the emphasis of the NIST SR...

Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features

Speaker verification in real-world applications sometimes deals with limited duration of enrollment and/or test data. MFCC-based i-vector systems have defined the state-of-the-art for speaker verification, but it is well known that they are less effective with short utterances. To address this issue, we propose a method to leverage the speaker specificity and stationarity of subglottal acoustic...

Publication date: 2016